Introduction to PyTorch: Why Tensors Matter

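PyTorch is a flexible, dynamic open-source framework well suited to deep learning research and rapid prototyping. At its core is the tensor, a multidimensional array designed to handle efficiently the numerical operations that deep learning models require. Tensors can also be placed on a GPU for hardware acceleration.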

1. Understanding Tensor Structure

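All inputs, outputs, and model parameters in PyTorch are represented as tensors. They serve the same purpose as NumPy arrays, but they can also run on specialized hardware such as GPUs, which makes them far more efficient for the large-scale linear algebra operations that neural networks require.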
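As a minimal sketch of the NumPy comparison above, the snippet below builds a tensor from a NumPy array and moves it to a GPU only if one is available. The variable names (`arr`, `t`, `device`, `product`) are illustrative and not part of the original lesson.

```python
import numpy as np
import torch

# A small NumPy array and a tensor created from it.
# torch.from_numpy shares memory with the source array on the CPU.
arr = np.arange(6, dtype=np.float32).reshape(2, 3)
t = torch.from_numpy(arr)

# Use a GPU when one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device)

# The same linear algebra call works on either device.
product = t_dev @ t_dev.T  # (2, 3) @ (3, 2) -> (2, 2)
print(product.shape, product.device)
```

The code is written once and runs on whichever device the tensor lives on, which is the main practical difference from a plain NumPy array.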

Key attributes that define a tensor:

Shape: Defines the dimensions of the data, expressed as a tuple (e.g., $4 \times 32 \times 32$ for a batch of images).
Data type (dtype): Specifies the numeric type of the stored elements (e.g., torch.float32 for model weights or torch.int64 for indexing).
Device: Indicates the physical hardware location, typically 'cpu' or 'cuda' (NVIDIA GPU).
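The three attributes can be inspected directly on any tensor. The following is a small sketch; the tensors `images` and `labels` are made-up examples, and the interpretation of the batch as four 32x32 single-channel images is an assumption.

```python
import torch

# Hypothetical batch of four 32x32 images, filled with zeros.
images = torch.zeros(4, 32, 32)
# Integer class labels; Python ints become torch.int64 by default.
labels = torch.tensor([0, 2, 1, 3])

print(images.shape)   # torch.Size([4, 32, 32])
print(images.dtype)   # torch.float32 (the default floating-point dtype)
print(labels.dtype)   # torch.int64
print(images.device)  # cpu, unless a different default device is configured
```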

Dynamic Graphs and Automatic Differentiation (Autograd)

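PyTorch uses an imperative execution model, so the computation graph is built as operations are executed. This lets Autograd, the built-in automatic differentiation engine, track the operations applied to a tensor, provided the tensor has requires_grad=True set. As a result, computing gradients during backpropagation becomes straightforward.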
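A minimal sketch of this behavior, using a made-up input `x` and the function $y = \sum_i x_i^2$, whose gradient is $2x$:

```python
import torch

# requires_grad=True tells Autograd to track operations on x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# The graph is recorded while these operations run.
y = (x ** 2).sum()  # y = 1 + 4 + 9 = 14

# Backpropagate: fills x.grad with dy/dx = 2 * x.
y.backward()
print(x.grad)  # tensor([2., 4., 6.])
```

Without `requires_grad=True`, `y.backward()` would raise an error because no graph would have been recorded for `x`.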

Question 1

Which command creates a $5 \times 5$ tensor containing random numbers following a uniform distribution between 0 and 1?

torch.rand(5, 5)

torch.random(5, 5)

torch.uniform(5, 5)

torch.randn(5, 5)

Question 2

If tensor $A$ is on the CPU, and tensor $B$ is on the CUDA device, what happens if you try to compute $A + B$?

An error occurs because operations require tensors on the same device.

PyTorch automatically moves $A$ to the CUDA device and proceeds.

The operation is performed on the CPU, and the result is returned to the CPU.

Question 3

What is the most common data type (dtype) used for model weights and intermediate calculations in Deep Learning?

torch.float32 (single-precision floating point)

torch.int64 (long integer)

torch.bool

torch.float64 (double-precision floating point)

Challenge: Tensor Manipulation and Shape

Prepare a tensor for a specific matrix operation.

You have a feature vector $F$ of shape $(10,)$. You need to multiply it by a weight matrix $W$ of shape $(10, 5)$. To treat $F$ explicitly as a row vector and obtain a 2-dimensional result, $F$ must first be made 2-dimensional.

Step 1

What should the shape of $F$ be before multiplication with $W$?

Solution:
The inner dimensions must match, so $F$ must be $(1, 10)$. Then $(1, 10) @ (10, 5) \rightarrow (1, 5)$.
Code: F_new = F.unsqueeze(0) or F_new = F.view(1, -1)

Step 2

Perform the matrix multiplication between $F_{new}$ and $W$ (shape $(10, 5)$).

Solution:
The operation is straightforward MatMul.
Code: output = F_new @ W or output = torch.matmul(F_new, W)
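Steps 1 and 2 can be sketched end to end as follows. The values placed in `F` and `W` are arbitrary placeholders chosen so the result is easy to check by hand.

```python
import torch

# Placeholder data: F holds 0..9, W is all ones.
F = torch.arange(10, dtype=torch.float32)  # shape (10,)
W = torch.ones(10, 5)                      # shape (10, 5)

# Step 1: add a leading dimension so F becomes a row vector.
F_new = F.unsqueeze(0)                     # shape (1, 10)

# Step 2: (1, 10) @ (10, 5) -> (1, 5)
output = F_new @ W
print(F_new.shape, output.shape)
```

Each entry of `output` is the sum 0 + 1 + ... + 9 = 45, since every column of `W` is all ones.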

Step 3

Which method returns a tensor with an explicitly specified shape, allowing you to flatten a tensor to $(50,)$? (Assume $F$ has shape $(5, 10)$ before flattening.)

Solution:
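Use the view or reshape methods. Passing -1 for one dimension lets PyTorch infer its size, which is a convenient way to flatten. Note that view requires a compatible (typically contiguous) memory layout, whereas reshape copies the data when necessary.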
Code: F_flat = F.view(-1) or F_flat = F.reshape(50)
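A short sketch of Step 3, assuming a placeholder $(5, 10)$ tensor. It also shows one case where the two methods differ: a transposed tensor is not contiguous, so `reshape` is the safer choice there.

```python
import torch

# Placeholder (5, 10) tensor holding 0..49.
F = torch.arange(50).reshape(5, 10)

# Both calls flatten to shape (50,).
F_flat = F.view(-1)
F_flat_alt = F.reshape(50)

# Transposing yields a non-contiguous tensor of shape (10, 5).
# Calling view(-1) on it would raise a RuntimeError; reshape copies instead.
Ft = F.t()
Ft_flat = Ft.reshape(-1)
print(F_flat.shape, Ft.is_contiguous(), Ft_flat.shape)
```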